Abstract

The standard Lagrange multiplier test for heteroskedasticity was originally developed assuming normality
of the disturbance term [see Godfrey (1978b), and Breusch and Pagan (1979)]. Therefore,
the  resulting  test  depends  heavily  on  the  normality  assumption.  Koenker  (1981)  suggests  a  stu- 
dentized  form  which  is  robust  to  nonnormality.  This  approach  seems  to  be  limited  because  of  the 
unavailability  of  a  general  procedure  that  transforms  a  test  to  a  robust  one.  Following  Bickel  (1978), 
we  use  a  different  approach  to  take  account  of  nonnormality.  Our  tests  will  be  based  on  the  score 
function  which  is  defined  as  the  negative  derivative  of  the  log-density  function  with  respect  to  the 
underlying  random  variable.  To  implement  the  test  we  use  a  nonparametric  estimate  of  the  score 
function.  Our  robust  test  for  heteroskedasticity  is  obtained  by  running  a  regression  of  the  product 
of  the  score  function  and  ordinary  least  squares  residuals  on  some  exogenous  variables  which  are 
thought  to  be  causing  the  heteroskedasticity.  We  also  use  our  procedure  to  develop  a  robust  test  for 
autocorrelation  which  can  be  computed  by  regressing  the  score  function  on  the  lagged  ordinary  least 
squares  residuals  and  the  independent  variables.  Finally,  we  carry  out  an  extensive  Monte  Carlo 
study  which  demonstrates  that  our  proposed  tests  have  superior  finite  sample  properties  compared 
to  the  standard  tests. 


1      Introduction 

Conventional  model  specification  tests  are  performed  with  some  parametric,  usually  the  Gaussian, 
assumptions  on  the  stochastic  process  generating  a  model.  These  parametric  specification  tests  have 
the  drawback  of  having  incorrect  sizes,  suboptimal  power  or  even  being  inconsistent  when  any  of 
the  parametric  specifications  of  the  stochastic  process  is  incorrect,  [see  Box  (1953),  Tukey  (1960), 
Bickel  (1978)  and  Koenker  (1981)  for  theoretical  arguments,  and  Bera  and  Jarque  (1982),  Bera  and 
McKenzie  (1986),  and  Davidson  and  MacKinnon  (1983)  for  Monte  Carlo  evidence].  In  this  paper, 
we use a nonparametric estimate of the score function to develop some tests for heteroskedasticity and
autocorrelation  which  are  robust  to  distributional  misspecifications. 

The importance of the score function, defined as $\psi(x) = -\frac{d}{dx}\log f(x) = -f'(x)/f(x)$, where $f(x)$
is the probability density function of a random variable, to robust statistical procedures has been
sporadically  mentioned,  implicitly  or  explicitly,  throughout  the  past  few  decades  [see,  e.g.,  Ham- 
pel  (1973,  1974),  Bickel  (1978),  Koenker  (1982),  Joiner  and  Hall  (1983),  Manski  (1984),  and  Cox 
(1985)]. Only during the past decade has much work been done on nonparametric estimation of
the score function [see Stone (1975), Csorgo and Revesz (1983), Manski (1984), Cox (1985), Cox
and Martin (1988), and Ng (1991a, 1991b)]. These facilitate our development of nonparametric tests
of specifications using the score function without making any explicit parametric assumption on the
underlying distribution. Therefore, we expect our procedures to be immune to the loss of power and
incorrect sizes caused by distributional misspecifications.

The  use  of  the  score  function  in  the  context  of  model  specification  testing  is  not  new.  Robustifying 
the  procedures  of  Anscombe  (1961)  and  Anscombe  and  Tukey  (1963),  Bickel  (1982)  derives  the  test 
statistics for testing nonlinearity and heteroskedasticity which implicitly use the score function [see
also  Pagan  and  Pak  (1991)].  In  this  paper,  we  follow  the  Lagrange  multiplier  test  procedure  and 
derive  the  test  statistics  which  turn  out  to  be  functions  of  the  score  function. 

Our  nonparametric  test  for  heteroskedasticity  is  obtained  by  running  a  regression  of  the  product 
of  the  score  function  and  the  ordinary  least  squares  residuals  on  some  exogenous  variables  which  are 
thought  to  be  causing  the  heteroskedasticity.  The  nonparametric  autocorrelation  test  is  performed 
by  regressing  the  score  function  on  the  lagged  residuals  and  the  independent  variables,  which  may 
include lagged dependent variables. We also show in the paper that when the normality assumption
holds, our tests for heteroskedasticity and autocorrelation reduce to the familiar Breusch and Pagan
(1979) or Godfrey (1978b) tests for heteroskedasticity and Breusch (1978) or Godfrey (1978a) tests
for autocorrelation respectively.

We perform an extensive Monte Carlo study which demonstrates that our proposed tests have
superior finite sample properties compared to the standard tests when the innovation deviates from
normality, while retaining comparable performance under normal innovations.

The  model  and  the  test  statistics  are  introduced  and  defined  in  Section  2.  In  Section  3,  we  derive 
the  one-directional  test  statistics  for  heteroskedasticity  and  autocorrelation.  Section  4  gives  a  brief 
review  of  existing  score  function  estimation  techniques  and  a  description  of  the  score  estimator  used 
in  the  Monte  Carlo  study.  The  finite  sample  performances  of  the  conventional  test  statistics  and  our 
proposed  nonparametric  tests  are  reported  in  Section  5. 


2     The  Model  and  the  Test  Statistics 

2.1      The  Model 

In  order  to  compare  our  findings  with  those  of  previous  studies,  we  consider  the  following  general 
model  which  incorporates  various  deviations  from  the  classical  linear  regression  model 

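$$\gamma(L)\, y_i = x_i'\beta + u_i, \qquad i = 1, \ldots, n \tag{1}$$

$$\delta(L)\, u_i = \epsilon_i \tag{2}$$

where $y_i$ is a dependent variable, $x_i$ is a $k \times 1$ vector of non-stochastic explanatory variables, $\beta$ is a
$k \times 1$ vector of unknown parameters, and $\gamma(L)$ and $\delta(L)$ are polynomials in the lag operator with

$$\gamma(L) = 1 - \sum_{j=1}^{m} \gamma_j L^j,$$

$$\delta(L) = 1 - \sum_{j=1}^{p} \delta_j L^j.$$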

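The normalized innovation term is defined as $z_i = \epsilon_i/\sigma_i$. The innovation $\epsilon_i$ is independently distributed
and has a symmetric probability density function $f_\epsilon(\epsilon_i) = \frac{1}{\sigma_i} f_z(\epsilon_i/\sigma_i)$ with the location parameter
assumed to be zero and the scale parameter taking the form

$$\sigma_i = \sqrt{h(v_i'\alpha)}$$

in which $v_i$ is a $q \times 1$ vector of fixed variables having one as its first element, $\alpha = (\alpha_1, \alpha_2')'$ is a
$q \times 1$ vector of unknown parameters, and $h$ is a known, smooth positive function with continuous first
derivative. The score function of the innovation $\epsilon_i$ is defined as

$$\psi_\epsilon(\epsilon_i) = -\frac{f_\epsilon'(\epsilon_i)}{f_\epsilon(\epsilon_i)} = -\frac{1}{\sigma_i}\frac{f_z'(z_i)}{f_z(z_i)} = \frac{1}{\sigma_i}\,\psi_z\!\left(\frac{\epsilon_i}{\sigma_i}\right). \tag{3}$$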

Model (1) and (2) can be written more compactly as

$$y_i = Y_i'\gamma + x_i'\beta + u_i \tag{4}$$

$$u_i = U_i'\delta + \epsilon_i \tag{5}$$

where

$$Y_i = (y_{i-1}, \ldots, y_{i-m})', \quad U_i = (u_{i-1}, \ldots, u_{i-p})', \quad \gamma = (\gamma_1, \ldots, \gamma_m)', \quad \delta = (\delta_1, \ldots, \delta_p)'.$$

In matrix form the model is

$$y = Y\gamma + X\beta + u = W\Gamma + u$$

$$u = U\delta + \epsilon$$

2.2     Test  Statistics

Most conventional hypothesis tests use the likelihood ratio (LR), Wald (W), or Lagrange
multiplier (LM) principle. Each has its own appeal. The LR test is favorable when a computer
package  conveniently  produces  the  constrained  and  unconstrained  likelihoods.  The  Wald  test  is 
preferable  when  the  unrestricted  MLE  is  easier  to  estimate.  In  model  specification  tests,  LM  is  the 
preferred  principle  since  the  null  hypotheses  can  usually  be  written  as  restricting  a  subset  of  the 
parameters  of  interest  to  zero  and  the  restricted  MLE  becomes  the  OLS  estimator  for  the  classical 
normal  linear  model. 

Even though our nonparametric approach to specification tests does not lead to the OLS estimator
for the restricted MLE under the null hypothesis, we will demonstrate that the LM test can still use
the OLS or some other consistent estimators and specification tests can be performed conveniently
through  most  of  the  popular  computer  packages.  For  this  reason,  we  concentrate  solely  on  deriving 
the  LM  test  statistics  in  this  paper. 

Let $l_i(\theta)$ be the log-density of the $i$th observation, where $\theta$ is an $s \times 1$ vector of parameters. The
log-likelihood function for the $n$ independent observations is then

$$l = l(\theta) = \sum_{i=1}^{n} l_i(\theta).$$

The hypothesis to be tested is:

$$H_0: \; h(\theta) = 0$$

where $h(\theta)$ is an $r \times 1$ vector function of $\theta$ with $r < s$. We denote $H(\theta) = \partial h(\theta)/\partial\theta'$ and assume
that rank$(H) = r$, i.e. there are no redundant restrictions. The LM statistic is given by

$$LM = \tilde d'\, \tilde{\mathcal I}^{-1}\, \tilde d$$

where $d = d(\theta) = \partial l/\partial\theta$ is the score vector,

$$\mathcal I = \mathcal I(\theta) = Var[d(\theta)] = -E\left(\frac{\partial^2 l}{\partial\theta\,\partial\theta'}\right)$$

is the information matrix and the '~'s indicate that the quantities are evaluated at the restricted
MLE of $\theta$. Under $H_0$, LM is distributed as $\chi^2_r$ asymptotically.
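
A minimal numerical sketch of the formula $LM = \tilde d'\tilde{\mathcal I}^{-1}\tilde d$, assuming a textbook setting that is not part of the model above: i.i.d. $N(\mu, \sigma^2)$ observations with $\theta = (\mu, \sigma^2)'$ and $H_0: \mu = 0$, so that $r = 1 < s = 2$. The function name `lm_mean_zero` is hypothetical. Under the restriction the MLE of $\sigma^2$ is $\sum x_i^2/n$, the score for $\sigma^2$ vanishes, and the statistic reduces to $(\sum x_i)^2/\sum x_i^2$.

```python
import numpy as np

def lm_mean_zero(x):
    """LM statistic for H0: mu = 0 in an i.i.d. N(mu, sigma^2) sample."""
    x = np.asarray(x, dtype=float)
    n = x.size
    sigma2 = np.mean(x ** 2)                      # restricted MLE of sigma^2
    # Score vector evaluated at the restricted MLE (second element is zero).
    d = np.array([x.sum() / sigma2, 0.0])
    # Information matrix of (mu, sigma^2) for the normal model; it is diagonal.
    info = np.diag([n / sigma2, n / (2.0 * sigma2 ** 2)])
    return float(d @ np.linalg.solve(info, d))

lm = lm_mean_zero([1.0, 2.0, 3.0])                # equals 36/14
```

Under $H_0$ the value would be compared with a $\chi^2_1$ critical value.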


3      Specification  Tests 

The  usual  one-directional  specification  tests  of  the  model  given  by  (1)  and  (2)  in  Section  2.1  involve 
testing  the  following  hypotheses: 

1.  Homoskedasticity ($H$):      $H_0: \alpha_2 = 0$, assuming $\delta = 0$.

2.  Serial Independence ($I$):      $H_0: \delta = 0$, assuming $\alpha_2 = 0$.


3.1      Test  for  Heteroskedasticity 

Breusch and Pagan (1979) derived the LM test statistic for testing the presence of heteroskedasticity
under the normality assumption. Here we provide the full derivation for the LM statistic since the
situation is somewhat different due to the nonparametric specification of the innovation distribution.

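Assuming $\delta = 0$, the p.d.f. of the stochastic process specified in Section 2.1 can be written as

$$f_\epsilon(u_i) = \frac{1}{\sigma_i}\, f_z\!\left(\frac{u_i}{\sigma_i}\right), \qquad u_i = y_i - Y_i'\gamma - x_i'\beta.$$

We shall partition the vector of parameters of model (4) and (5) into

$$\theta = (\gamma', \beta', \alpha')'.$$

The log-likelihood function is then given by

$$l(\theta) = \sum_{i=1}^{n}\left\{\log f_z\!\left(\frac{u_i}{\sigma_i}\right) - \log\sigma_i\right\}
= \sum_{i=1}^{n}\left[\log f_z\!\left(\frac{u_i}{\sqrt{h(v_i'\alpha)}}\right) - \frac{1}{2}\log h(v_i'\alpha)\right].$$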


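The score vector under $H_0$ becomes

$$\frac{\partial l}{\partial\gamma} = \frac{1}{\tilde\sigma}\sum_{i=1}^{n}\psi_z\!\left(\frac{\tilde u_i}{\tilde\sigma}\right)Y_i = 0$$

$$\frac{\partial l}{\partial\beta} = \frac{1}{\tilde\sigma}\sum_{i=1}^{n}\psi_z\!\left(\frac{\tilde u_i}{\tilde\sigma}\right)x_i = 0$$

$$\frac{\partial l}{\partial\alpha} = \frac{h'(\tilde\alpha_1)}{2\tilde\sigma^2}\sum_{i=1}^{n} v_i\left[\psi_z\!\left(\frac{\tilde u_i}{\tilde\sigma}\right)\frac{\tilde u_i}{\tilde\sigma} - 1\right]$$

whose first element, the derivative with respect to $\alpha_1$, is zero, and
where $\tilde\sigma^2 = h(\tilde\alpha_1)$, $\tilde u_i = y_i - Y_i'\tilde\gamma - x_i'\tilde\beta$, and $\tilde\alpha_1$, $\tilde\gamma$ and $\tilde\beta$ are the restricted MLE obtained as the
solutions to the above first order conditions.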
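If we partition the information matrix into

$$\mathcal I = \begin{pmatrix} \mathcal I_{11} & \mathcal I_{12} \\ \mathcal I_{21} & \mathcal I_{22} \end{pmatrix}$$

corresponding to $\theta = (\theta_1', \theta_2')'$, with $\theta_1 = (\gamma', \beta')'$ and $\theta_2 = \alpha$, we can see that

$$\mathcal I_{12} = \mathcal I_{21}' = -E\left[\frac{\partial^2 l}{\partial\theta_1\,\partial\theta_2'}\right]
= E\left\{\sum_{i=1}^{n}\frac{h'(v_i'\alpha)}{2\sigma_i^3}\left[\psi_z'\!\left(\frac{u_i}{\sigma_i}\right)\frac{u_i}{\sigma_i} + \psi_z\!\left(\frac{u_i}{\sigma_i}\right)\right] W_i v_i'\right\} \tag{6}$$

where $W_i = (Y_i', x_i')'$. With the assumption of a symmetric p.d.f. for $u_i$, $\mathcal I_{12} = \mathcal I_{21}' = 0$ due to the fact that both terms
in (6) are odd functions. The lower right partition of $\mathcal I$ is given by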

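$$\mathcal I_{22} = Var[d_2(\theta)] = Var\left[\frac{\partial l}{\partial\theta_2}\right].$$

Letting $c_i = \frac{1}{2\sigma_i^2}h'(v_i'\alpha)$ and $g_i = \psi_z\!\left(\frac{u_i}{\sigma_i}\right)\left(\frac{u_i}{\sigma_i}\right) = \psi_\epsilon(u_i)u_i$, we get

$$d_2(\theta) = \sum_{i=1}^{n} c_i v_i (g_i - 1)$$

from the first order conditions. This gives us

$$\mathcal I_{22} = \sum_{i=1}^{n} c_i^2\, v_i v_i'\, Var(g_i).$$

Denoting $\sigma_g^2 = Var(g_i)$, we have

$$\mathcal I_{22} = \sigma_g^2 \sum_{i=1}^{n} c_i^2\, v_i v_i'.$$

We can estimate $\sigma_g^2$ by the consistent estimator

$$\tilde\sigma_g^2 = \frac{\sum_{i=1}^{n}\tilde g_i^2}{n} - \left(\frac{\sum_{i=1}^{n}\tilde g_i}{n}\right)^2$$

and get

$$\tilde\sigma_g^2 = \frac{\sum_{i=1}^{n}\tilde g_i^2}{n} - 1$$

since $\sum_{i=1}^{n}\tilde g_i = n$ from the first order condition for $\alpha_1$. Let

$$\tilde g = (\tilde g_1, \ldots, \tilde g_n)', \qquad V = (v_1, \ldots, v_n)', \qquad 1 = (1, \ldots, 1)'.$$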

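Since the information matrix is block diagonal, the LM statistic for testing

$$H_0: \; \alpha_2 = 0$$

can be written as

$$LM_H = \tilde d_2'\, \tilde{\mathcal I}_{22}^{-1}\, \tilde d_2$$

$$= \frac{1}{\sigma_g^2}\,(\tilde g - 1)'\, V (V'V)^{-1} V'\, (\tilde g - 1)$$

$$= \frac{1}{\sigma_g^2}\left\{\tilde g' V (V'V)^{-1} V' \tilde g - 2\tilde g' 1 + n\right\}$$

$$= \frac{1}{\sigma_g^2}\left\{\tilde g' V (V'V)^{-1} V' \tilde g - \tilde g' 1 (1'1)^{-1} 1' \tilde g\right\}.$$

If we substitute $\tilde\sigma_g^2$ for $\sigma_g^2$ into $LM_H$, we get

$$\widetilde{LM}_H = \frac{1}{\tilde\sigma_g^2}\left\{\tilde g' V (V'V)^{-1} V' \tilde g - \tilde g' 1 (1'1)^{-1} 1' \tilde g\right\}.$$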

The $\widetilde{LM}_H$ test is not feasible and neither is $LM_H$ because the score function $\psi_\epsilon$ of the innovation
is unknown and, hence, prevents us from solving for the restricted MLE $\tilde\alpha_1$, $\tilde\gamma$, and $\tilde\beta$. To obtain a
feasible version of the $LM_H$ statistic, let $\hat\gamma$ and $\hat\beta$ be any weakly consistent estimators, e.g. the OLS
estimators, for $\gamma$ and $\beta$ respectively, and $\hat\psi_\epsilon$ be a weakly consistent estimator for the true score
function $\psi_\epsilon$ over the interval $[\hat u_{(1)}, \hat u_{(n)}]$. Here $\hat u_{(1)}$ and $\hat u_{(n)}$ are the extreme order statistics of the
consistent residuals. Denoting $\hat g_i = \hat\psi_\epsilon(\hat u_i)(\hat u_i)$ and $\hat\sigma_g^2 = \frac{\sum_{i=1}^{n}\hat g_i^2}{n} - 1$, we define our operational
form of the LM statistic as

$$\widehat{LM}_H = \frac{1}{\hat\sigma_g^2}\left\{\hat g' V (V'V)^{-1} V' \hat g - \hat g' 1 (1'1)^{-1} 1' \hat g\right\} = nR^2$$

where $R^2$ is the centered coefficient of determination from running a regression of $\hat g$ on $V$.
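
The operational statistic can be computed with two least-squares fits. The Python sketch below is illustrative only: the function name `lm_het`, the argument `score` (a callable standing in for the estimated score function) and the Gaussian plug-in used in the example are assumptions of this sketch, whereas the tests in this paper use a smoothing spline score estimator. `X` stacks the regressors of the mean equation and `V` stacks the $v_i'$, including the column of ones.

```python
import numpy as np

def lm_het(y, X, V, score):
    """n * centered R^2 from regressing g_i = score(u_i) * u_i on V."""
    y = np.asarray(y, dtype=float)
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)   # OLS as the consistent estimator
    u = y - X @ beta                                # consistent residuals
    g = score(u) * u                                # g_i = psi(u_i) * u_i
    coef, *_ = np.linalg.lstsq(V, g, rcond=None)    # auxiliary regression of g on V
    rss = np.sum((g - V @ coef) ** 2)
    tss = np.sum((g - g.mean()) ** 2)
    return len(y) * (1.0 - rss / tss)

# Example with a Gaussian plug-in score psi(u) = u / sigma^2 (an assumption of this sketch).
rng = np.random.default_rng(0)
n = 200
x = rng.normal(10.0, 5.0, n)
X = np.column_stack([np.ones(n), x])
V = X.copy()                                        # variance regressors, constant first
y = 1.0 + 0.5 * x + rng.normal(0.0, 5.0, n)
gaussian_score = lambda u: u / np.mean(u ** 2)
stat = lm_het(y, X, V, gaussian_score)
```

With the Gaussian plug-in, $\hat g_i$ is proportional to the squared residual, so the value coincides with $n$ times the centered $R^2$ from regressing squared OLS residuals on $V$. The statistic would be compared with a $\chi^2_{q-1}$ critical value.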

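We now demonstrate that under the null hypothesis, $\widehat{LM}_H$ is asymptotically distributed as
$\chi^2_{q-1}$. Since $LM_H$ is the standard Lagrange multiplier statistic, under $H_0$, $LM_H \xrightarrow{D} \chi^2_{q-1}$. Under
homoskedasticity, we are in an IID set up and hence $\tilde\sigma_g^2 - \sigma_g^2 = o_p(1)$. Hence, under the null, $LM_H$
and $\widetilde{LM}_H$ will have the same asymptotic distribution. Next we show the asymptotic equivalence of
$\widehat{LM}_H$ and $\widetilde{LM}_H$. First, we note that, under $H_0$, $\hat u_i - \tilde u_i = o_p(1)$. Since $\psi_\epsilon$ is a continuous function,
$\psi_\epsilon(\hat u_i) - \psi_\epsilon(\tilde u_i) = o_p(1)$. With $\hat\psi_\epsilon$ being consistent over $[\hat u_{(1)}, \hat u_{(n)}]$, $\hat g_i - \tilde g_i = o_p(1)$, and therefore
$\hat\sigma_g^2 - \tilde\sigma_g^2 = o_p(1)$. Next we consider the numerators of $\widehat{LM}_H$ and $\widetilde{LM}_H$. These numerators are
based on the OLS regression of, respectively, $\hat g$ and $\tilde g$ on $V$. Let us denote $\hat\eta = (V'V)^{-1}V'\hat g$ and
$\tilde\eta = (V'V)^{-1}V'\tilde g$. Now denoting $d = \hat g - \tilde g$, we have

$$\hat\eta - \tilde\eta = \left(\frac{V'V}{n}\right)^{-1}\frac{V'd}{n} = \left(\frac{V'V}{n}\right)^{-1}\frac{1}{n}\sum_{i=1}^{n} v_i d_i.$$

Cox (1985, p.276) showed that $|d_i| = |\hat g_i - \tilde g_i| = O_p(n^{-\delta})$ for $\delta > 0$ in the range given there. Therefore,
$\frac{1}{n}\sum_{i=1}^{n}|d_i| = o_p(1)$. Suppose $|v_1|, \ldots, |v_n|$ are bounded by $m < \infty$, then we can write

$$\left|\frac{1}{n}\sum_{i=1}^{n} v_i d_i\right| \le \frac{1}{n}\sum_{i=1}^{n}|v_i|\,|d_i| \le m\,\frac{1}{n}\sum_{i=1}^{n}|d_i| = o_p(1).$$

This establishes that $\widehat{LM}_H$ and $\widetilde{LM}_H$ are asymptotically equivalent, and hence, under $H_0$,
$\widehat{LM}_H \xrightarrow{D} \chi^2_{q-1}$.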

Several interesting special cases can easily be derived from $\widehat{LM}_H$ assuming different specifications
for $f_\epsilon(\epsilon_i)$. For example, under the normality assumption on $f_\epsilon(\epsilon_i)$, $\psi_\epsilon(u_i) = u_i/\sigma^2$, and
$\widehat{LM}_H - LM_{BP} \xrightarrow{p} 0$, where $LM_{BP}$ is the LM statistic for testing heteroskedasticity in Breusch
and Pagan (1979). If $f_\epsilon(\epsilon_i)$ is a double exponential distribution [Box and Tiao (1973, p.157)], $\widehat{LM}_H$
asymptotically becomes Glejser's (1969) statistic which regresses $|\hat u_i|$ on $v_i$ [see Pagan and Pak
(1991)]. Finally, for the logistic innovation, our $\widehat{LM}_H$ statistic is obtained by regressing $\hat u_i\left(\frac{e^{\hat u_i} - 1}{e^{\hat u_i} + 1}\right)$
on $v_i$. Note that the score functions for the double exponential and logistic distributions are bounded,
and therefore, the latter two tests might perform better for fat tailed distributions.
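
The three score functions behind these special cases can be written down and checked numerically against the definition $\psi = -(\log f)'$. This is a small sketch; the function names and the scale arguments `sigma`, `b` and `s` are choices made for the illustration.

```python
import math

def score_normal(u, sigma=1.0):
    # N(0, sigma^2): psi(u) = u / sigma^2, so psi(u) * u = u^2 / sigma^2
    return u / sigma ** 2

def score_double_exponential(u, b=1.0):
    # density proportional to exp(-|u| / b): psi(u) = sign(u) / b (undefined at u = 0),
    # so psi(u) * u = |u| / b
    return math.copysign(1.0, u) / b

def score_logistic(u, s=1.0):
    # psi(u) = (1/s) * (e^{u/s} - 1) / (e^{u/s} + 1), written with tanh for stability
    return math.tanh(u / (2.0 * s)) / s

def numerical_score(log_density, u, h=1e-6):
    # central-difference approximation to -(d/du) log f(u)
    return -(log_density(u + h) - log_density(u - h)) / (2.0 * h)

# Log densities up to additive constants, which do not affect the score.
log_normal = lambda u: -u ** 2 / 2.0
log_double_exponential = lambda u: -abs(u)
log_logistic = lambda u: -u - 2.0 * math.log(1.0 + math.exp(-u))
```

The last two scores are bounded in absolute value by $1/b$ and $1/s$, which is the boundedness property noted above.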

3.2     Test  for  Serial  Correlation 

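Given the model specified by (4) and (5) along with the assumption $\alpha_2 = 0$, the null hypothesis of
serial independence is

$$H_0: \; \delta = 0.$$

Writing

$$\theta = (\theta_1, \gamma', \beta', \delta')' = (\theta_1, \theta_2')',$$

our model for testing serial independence can be written as

$$y_i = q_i(W_i, U_i; \theta_2) + \epsilon_i \tag{7}$$

where $\theta_2$ is a $(m + k + p) \times 1$ vector and the $\epsilon_i$'s are I.I.D. with symmetric p.d.f. $f_\epsilon(\epsilon_i) = \frac{1}{\theta_1} f_z\!\left(\frac{\epsilon_i}{\theta_1}\right)$,
in which $\theta_1$ is the scale parameter.
The log-likelihood function is

$$l(\theta) = \sum_{i=1}^{n}\left[\log f_z\!\left(\frac{\epsilon_i}{\theta_1}\right) - \log\theta_1\right]$$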

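and the first order conditions for the restricted MLE are

$$\frac{\partial l}{\partial\theta_1} = \frac{1}{\tilde\theta_1}\sum_{i=1}^{n}\left[\psi_z\!\left(\frac{\tilde u_i}{\tilde\theta_1}\right)\frac{\tilde u_i}{\tilde\theta_1} - 1\right] = 0$$

$$\frac{\partial l}{\partial\gamma} = \frac{1}{\tilde\theta_1}\sum_{i=1}^{n}\psi_z\!\left(\frac{\tilde u_i}{\tilde\theta_1}\right)Y_i = 0$$

$$\frac{\partial l}{\partial\beta} = \frac{1}{\tilde\theta_1}\sum_{i=1}^{n}\psi_z\!\left(\frac{\tilde u_i}{\tilde\theta_1}\right)x_i = 0$$

where the '~'s again denote quantities evaluated at the restricted MLE, $\tilde u_i = y_i - Y_i'\tilde\gamma - x_i'\tilde\beta$,
and $\tilde U_i = (\tilde u_{i-1}, \ldots, \tilde u_{i-p})'$.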

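With the symmetry assumption on $f_\epsilon(\epsilon_i)$, as before, it can be easily proved that

$$E\left(\partial^2 l(\theta)/\partial\theta_2\,\partial\theta_1\right) = 0.$$

We, therefore, only need to evaluate $d_2$ and $\mathcal I_{22}$ if we are testing for restrictions on $\theta_2$. Denoting $Q$
as an $n \times (m + k + p)$ matrix with the $i$th row being $\partial q_i(W_i, U_i; \theta_2)/\partial\theta_2'$ and $\Psi$ an $n \times 1$ vector with
elements $\Psi_i = \frac{1}{\theta_1}\psi_z\!\left(\frac{\epsilon_i}{\theta_1}\right) = \psi_\epsilon(\epsilon_i)$, we have

$$d_2 = Q'\Psi$$

and

$$E[d_2 d_2'] = Q'(E\,\Psi\Psi')Q = \sigma_\psi^2\, Q'Q$$

where $\sigma_\psi^2 = E(\Psi_i^2)$. The LM statistic for testing $H_0: \delta = 0$ is given by

$$LM_I = \frac{\tilde\Psi'\tilde Q(\tilde Q'\tilde Q)^{-1}\tilde Q'\tilde\Psi}{\sigma_\psi^2}.$$

Letting $\tilde\sigma_\psi^2 = \tilde\Psi'\tilde\Psi/n$ be the consistent estimator for $\sigma_\psi^2$, we have

$$\widetilde{LM}_I = \frac{\tilde\Psi'\tilde Q(\tilde Q'\tilde Q)^{-1}\tilde Q'\tilde\Psi}{\tilde\sigma_\psi^2}.$$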

Similar to the test for heteroskedasticity, neither $LM_I$ nor $\widetilde{LM}_I$ is feasible. To obtain a feasible
version of the LM test, let $\hat\theta_2$ be any weakly consistent estimator for $\theta_2$, $\hat\psi_\epsilon$ be a weakly consistent
estimator for the true score function $\psi_\epsilon$ over the interval $[\hat\epsilon_{(1)}, \hat\epsilon_{(n)}]$, $\hat\epsilon_i = y_i - q_i(W_i, U_i; \hat\theta_2)$, $\hat\Psi$
an $n \times 1$ vector with elements $\hat\Psi_i = \hat\psi_\epsilon(\hat\epsilon_i)$, $\hat Q$ an $n \times (m + k + p)$ matrix with the $i$th row being
$\partial q_i(W_i, U_i; \hat\theta_2)/\partial\theta_2'$ and $\hat\sigma_\psi^2 = \hat\Psi'\hat\Psi/n$, then the feasible LM statistic for testing serial independence
in model (7) is given by

$$\widehat{LM}_I = \frac{\hat\Psi'\hat Q(\hat Q'\hat Q)^{-1}\hat Q'\hat\Psi}{\hat\sigma_\psi^2} = nR^2$$

where $R^2$ is the uncentered coefficient of determination of regressing $\hat\Psi$ on $\hat Q$.

Notice that the $n \times (m + k + p)$ matrix $\hat Q$ above has component $\hat Q_i = (Y_i', x_i', \hat U_i')$. This
facilitates the following simpler LM statistic.

$$\widehat{LM}_I = \frac{\hat\Psi'\hat U\left[\hat U'\hat U - \hat U'W(W'W)^{-1}W'\hat U\right]^{-1}\hat U'\hat\Psi}{\hat\sigma_\psi^2} = nR^2 \tag{8}$$

where $R^2$ is the uncentered coefficient of determination of regressing $\hat\Psi$ on $\hat U$ and $W$ due to the
orthogonality given in the first order condition on the score vector under $H_0$. A well known alternative
for computing the $\widehat{LM}_I$ statistic is to regress $\hat\Psi$ on $\hat U$ and $W$ and test the significance of the
coefficients of $\hat U$. Following similar arguments as in the case of heteroskedasticity, we can show that
under serial independence, $\widehat{LM}_I \xrightarrow{D} \chi^2_p$.
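
The regression form of the serial correlation statistic can be sketched in Python as follows. The function name `lm_serial`, the callable `score` standing in for the estimated score function, the zero pre-sample residuals, and the Gaussian plug-in used in the example are all assumptions of this sketch. `W` holds the regressors of the mean equation (here without lagged dependent variables, i.e. $m = 0$).

```python
import numpy as np

def lm_serial(y, W, p, score):
    """n * uncentered R^2 from regressing score(residuals) on [W, lagged residuals]."""
    y = np.asarray(y, dtype=float)
    n = len(y)
    coef, *_ = np.linalg.lstsq(W, y, rcond=None)    # OLS as the consistent estimator
    e = y - W @ coef
    # Lagged residuals; pre-sample values are set to zero (an assumption of this sketch).
    U = np.column_stack([np.concatenate([np.zeros(j), e[:n - j]])
                         for j in range(1, p + 1)])
    psi = score(e)                                   # estimated score at each residual
    Q = np.column_stack([W, U])
    b, *_ = np.linalg.lstsq(Q, psi, rcond=None)
    fitted = Q @ b
    return n * (fitted @ fitted) / (psi @ psi)       # n * uncentered R^2

rng = np.random.default_rng(1)
n = 200
x = rng.normal(10.0, 5.0, n)
W = np.column_stack([np.ones(n), x])
eps = rng.normal(0.0, 5.0, n)
u = np.empty(n)
u[0] = eps[0]
for i in range(1, n):                                # AR(1) errors with rho = 0.7
    u[i] = 0.7 * u[i - 1] + eps[i]
gaussian_score = lambda e: e / np.mean(e ** 2)
stat_iid = lm_serial(1.0 + 0.5 * x + eps, W, 1, gaussian_score)
stat_ar = lm_serial(1.0 + 0.5 * x + u, W, 1, gaussian_score)
```

Each value would be compared with a $\chi^2_p$ critical value (here $p = 1$); the statistic for the autocorrelated series should be far larger than the one for the independent series.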

As in the case of $\widehat{LM}_H$, several interesting special cases can be obtained from $\widehat{LM}_I$. Under
the normality assumption, we have $\hat\Psi_i = \hat\epsilon_i$ and $\widehat{LM}_I - LM_{BG} \xrightarrow{p} 0$, where $LM_{BG}$ is the LM statistic
for testing autocorrelation in Breusch (1978) and Godfrey (1978a). The test can be performed by
regressing $\hat\epsilon$ on $\hat U$ and $W$. When the density of the innovation is double exponential, our test is
performed by regressing $sign(\hat\epsilon_i)$ on $\hat U_i'$ and $W_i'$. This is similar to the sign test for randomness of a
process. If the innovation has a logistic density, our $\widehat{LM}_I$ test is equivalent to regressing $\frac{e^{\hat\epsilon_i} - 1}{e^{\hat\epsilon_i} + 1}$ on
$\hat U_i'$ and $W_i'$.


4     Score  Function  Estimation

The score function as defined in (3) plays an important role in many aspects of statistics. It can
be used for data exploration purposes, for Fisher information estimation and for the construction of
adaptive estimators of semiparametric econometric models in robust econometrics [see e.g. Cox and
Martin (1988) and Ng (1991a)]. Here we use it to construct the nonparametric test statistics $\widehat{LM}_H$
and $\widehat{LM}_I$.

Most  existing  score  function  estimators  are  constructed  by  computing  the  negative  logarithmic 
derivative  of  some  kernel  based  density  estimators  [see  e.g.  Stone  (1975),  Manski  (1984),  and  Cox 
and  Martin  (1988)].  Csorgo  and  Revesz  (1983)  suggested  a  nearest-neighbor  approach.  Modifying 
the  approach  suggested  in  Cox  (1985),  Ng  (1991a)  implemented  an  efficient  algorithm  to  compute 
the  smoothing  spline  score  estimator  that  solved 


$$\min_{\psi \in H_2[a,b]} \int (\psi^2 - 2\psi')\,dF_n + \lambda\int (\psi''(x))^2\,dx \tag{9}$$

where $H_2[a,b] = \{\psi : \psi, \psi'$ are absolutely continuous, and $\int_a^b [\psi''(x)]^2 dx < \infty\}$. The objective
function (9) is the (penalized) empirical analogue of minimizing the following mean-squared error:

$$\int (\psi - \psi_0)^2\,dF_0 = \int (\psi^2 - 2\psi')\,dF_0 + \int \psi_0^2\,dF_0 \tag{10}$$

in which $\psi_0$ is the unknown true score function and the equality is due to the fact that under some
mild regularity conditions [see Cox (1985)]

$$\int \psi_0\,\zeta\,dF_0 = -\int f_0'(x)\,\zeta(x)\,dx = \int \zeta'\,dF_0.$$


Since the second term on the right hand side of (10) is independent of $\psi$, minimizing the mean-squared
error may focus exclusively on the first term. Minimizing (9) yields a balance between
"fidelity-to-data" measured by the mean-squared error term and the smoothness represented by the
second term. As in any nonparametric score function estimator, the smoothing spline score estimator
has the penalty parameter $\lambda$ to choose. The penalty parameter merely controls the tradeoff between
"fidelity-to-data" and smoothness of the estimated score function. An automatic penalty parameter
choice mechanism is suggested and implemented in Ng (1991a) through robust information criteria
[see Ng (1991b) for the FORTRAN source code].
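
To see why the criterion in (9) targets the score, it helps to evaluate its unpenalized empirical part on a simple family. The Python sketch below is not the smoothing spline algorithm of Ng (1991a); it restricts $\psi$ to the linear family $\psi(x) = ax$ (for which $\psi'' = 0$, so the penalty vanishes), and the function names are hypothetical. The sample criterion is then $a^2\,\overline{x^2} - 2a$, minimized at $a = 1/\overline{x^2}$, which for $N(0, 25)$ data is close to the true score slope $1/25$.

```python
import numpy as np

def empirical_criterion(psi, dpsi, x):
    """Sample analogue of the integral of (psi^2 - 2 psi') dF_n, without the penalty."""
    x = np.asarray(x, dtype=float)
    return float(np.mean(psi(x) ** 2 - 2.0 * dpsi(x)))

def best_linear_slope(x):
    """Minimizer of the criterion over psi(x) = a * x, namely a = 1 / mean(x^2)."""
    x = np.asarray(x, dtype=float)
    return 1.0 / float(np.mean(x ** 2))

rng = np.random.default_rng(0)
x = rng.normal(0.0, 5.0, 5000)           # N(0, 25) sample; true score is x / 25
a_hat = best_linear_slope(x)

def crit(a):
    # criterion value for the linear candidate psi(x) = a * x
    return empirical_criterion(lambda t: a * t, lambda t: np.full_like(t, a), x)
```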

The  performances  of  the  kernel  based  score  estimators  depend  very  much  on  using  the  correct 
kernel  that  reflects  the  underlying  true  distribution  generating  the  stochastic  process  besides  choosing 
the  correct  window  width.  The  right  choice  of  kernel  becomes  even  more  important  for  observations 
in  the  tails  where  density  is  low  since  few  observations  will  appear  in  the  tail  to  help  smooth  things 
out.  This  sensitivity  to  correct  kernel  choice  is  further  amplified  in  score  function  estimation  where 
higher  derivatives  of  the  density  are  involved  [see  Ng  (1991a)].  It  is  found  in  Ng  (1991a)  that  the 
smoothing  spline  score  estimator  which  finds  its  theoretical  justification  from  an  explicit  statistical 
decision  criterion,  i.e.  minimizing  the  mean-squared  error,  is  more  robust  than  the  ad  hoc  estimators, 
like the kernel based estimators, to distribution variations. We, therefore, use it to construct our
nonparametric test statistics.

Since no estimator can estimate the tails of the score function accurately, some form of trimming
is needed in the tails where observations are scarce to smooth things out. Cox (1985) showed that the
smoothing spline score estimator achieved uniformly weak consistency over a bounded finite support
$[a_0, b_0]$ which contains the observations $x_1, \ldots, x_n$. Denoting the solution to (9) as $\hat\psi(x)$, the score
estimator used in constructing our nonparametric statistics $\widehat{LM}_H$ and $\widehat{LM}_I$ given in Section 3 takes
the form

$$\hat\psi_\epsilon(x) = \begin{cases} \hat\psi(x) & \text{if } x_{(1)} \le x \le x_{(n)} \\ 0 & \text{otherwise} \end{cases} \tag{11}$$
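
The trimming rule can be sketched as a small wrapper. In the Python illustration below, `psi_fit` stands for any fitted score function (a hypothetical placeholder for the smoothing spline solution), and the function name `trimmed_score` is likewise an assumption of the sketch.

```python
import numpy as np

def trimmed_score(psi_fit, sample):
    """Return a score estimator that is zero outside the range of the sample."""
    lo, hi = np.min(sample), np.max(sample)          # extreme order statistics
    def psi_hat(x):
        x = np.asarray(x, dtype=float)
        inside = (x >= lo) & (x <= hi)
        return np.where(inside, psi_fit(x), 0.0)     # keep the fit inside, zero outside
    return psi_hat

psi_hat = trimmed_score(lambda t: t, [-1.0, 0.0, 2.0])
values = psi_hat([-3.0, -1.0, 1.0, 2.0, 5.0])        # -> [0., -1., 1., 2., 0.]
```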

5      Small  Sample  Performances 

All  the  results  on  the  LM  statistics  discussed  earlier  are  valid  only  asymptotically.  We  would, 
therefore,  like  to  study  the  finite  sample  behavior  of  the  various  statistics  in  this  section.  We  are 
interested in the closeness of the distributions of the statistics under the null, $H_0$, to the asymptotic
$\chi^2$ distributions, the estimates of the probabilities of Type-I error as well as the estimated powers.
The LM statistics involved in this simulation are $LM^*_H$ [given in Godfrey (1978b), and Breusch
and Pagan (1979)], $LM^*_I$ [given in Breusch (1978), and Godfrey (1978a)], $\widehat{LM}_H$ and $\widehat{LM}_I$. For the
LM statistics, the closeness of the distributions under the null to the asymptotic $\chi^2$ distributions
is measured by the Kolmogorov-Smirnov statistics, the estimated probabilities of Type-I errors are
measured by the proportion of rejections in the replications when the asymptotic $\chi^2$ critical values
are used, and the estimated powers are measured by the number of times the test statistics exceeded
the corresponding empirical critical points divided by the total number of replications.

We  are  using  the  simulation  models  of  Bera  and  Jarque  (1982)  and  Bera  and  McKenzie  (1986) 
so  that  our  results  can  be  compared  with  their  prior  findings.  The  linear  regression  model  is  given 
by 

$$y_i = \sum_{j=1}^{4} x_{ij}\beta_j + u_i$$

where $x_{i1} = 1$, $x_{i2}$ are random variates from $N(10, 25)$, $x_{i3}$ from the uniform $U(7.5, 12.5)$ and
$x_{i4}$ from $\chi^2_{10}$. The regression matrix, $X$, remains the same from one replication to another. Serially
correlated ($I$) errors are generated by the first order autoregressive ($AR$) process, $u_i = \rho u_{i-1} +
\epsilon_i$, where $|\rho| < 1$. As in Bera and Jarque (1982), and Bera and McKenzie (1986), the level of
autocorrelation is categorized into 'weak' and 'strong' by setting $\rho = \rho_1 = 0.3$ and $\rho = \rho_2 = 0.7$,
respectively. Heteroskedasticity ($H$) is generated by $E(\epsilon_i) = 0$ and $V(\epsilon_i) = \sigma_i^2 = 25 + \eta v_i$,
where $\sqrt{v_i} \sim N(10, 25)$ and $\eta$ is the parameter that determines the degree of heteroskedasticity,
with $\eta = \eta_1 = 0.25$ and $\eta = \eta_2 = 0.85$ representing 'weak' and 'strong' heteroskedasticity
respectively. In order to study the robustness of the various test statistics to distributional deviations
from the conventional Gaussian innovation assumption, the non-normal ($N$) disturbances used are
(1) the Student's $t$ distribution with five degrees of freedom, $t_5$, which represents moderately
thick-tailed distributions, (2) the log-normal, $log$, which represents asymmetric distributions, (3) the beta
distribution with scale and shape parameters 7, $B(7, 7)$, which represents distributions with bounded
supports, (4) the 50% normal mixture, $NM$, of two normal distributions, $N(-3, 1)$ and $N(3, 1)$,
which represents bi-modal distributions, (5) the beta distribution with scale 3 and shape 11, $B(3, 11)$,
which represents asymmetric distributions with bounded supports, and (6) the contaminated normal,
$CN$, which is the standard normal $N(0, 1)$ with .05% contamination from $N(0, 9)$, that attempts to
capture contamination in a real life situation. All distributions are normalized to having variance 25
under $H_0$. Figure 1 presents the score functions of all the above distributions. Notice from Figure
1 that distributions with thicker tails than the normal have receding score in the tails while those
with thinner tails than the normal have progressive score in the tails.
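
The data generating processes just described can be sketched in Python as below. Several details are assumptions of this sketch rather than statements from the text: the second argument of $N(\cdot, \cdot)$ is read as a variance, the coefficient vector `beta` is hypothetical (its values are not given above), the AR(1) recursion is started at $u_1 = \epsilon_1$, and normal innovations are used for the heteroskedastic case.

```python
import numpy as np

def make_design(n, rng):
    """Regressor matrix: constant, N(10, 25), U(7.5, 12.5) and chi-square(10) columns."""
    return np.column_stack([
        np.ones(n),
        rng.normal(10.0, 5.0, n),            # N(10, 25), reading 25 as the variance
        rng.uniform(7.5, 12.5, n),
        rng.chisquare(10, n),
    ])

def ar1_errors(eps, rho):
    """u_i = rho * u_{i-1} + eps_i, started at u_1 = eps_1 (an assumption)."""
    u = np.empty(len(eps))
    u[0] = eps[0]
    for i in range(1, len(eps)):
        u[i] = rho * u[i - 1] + eps[i]
    return u

def heteroskedastic_errors(n, eta, rng):
    """Errors with variance 25 + eta * v_i, where sqrt(v_i) ~ N(10, 25)."""
    v = rng.normal(10.0, 5.0, n) ** 2
    sigma = np.sqrt(25.0 + eta * v)
    return sigma * rng.standard_normal(n), sigma

rng = np.random.default_rng(123)
n = 50
X = make_design(n, rng)                      # held fixed across replications
beta = np.ones(4)                            # hypothetical coefficients
eps = 5.0 * rng.standard_normal(n)           # N(0, 25) innovations
y_serial = X @ beta + ar1_errors(eps, 0.3)   # 'weak' autocorrelation
het_eps, sigma = heteroskedastic_errors(n, 0.25, rng)
y_het = X @ beta + het_eps                   # 'weak' heteroskedasticity
```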

The experiments are performed for sample size $N$ = 25, 50, and 100. The number of replications
is 250. The Kolmogorov-Smirnov statistics for the various LM statistics are reported in Table 1.

Table 1.  Kolmogorov-Smirnov Statistics for Testing Departures from $\chi^2$ distribution

Disturbance                                      Sample Size
Distribution    Statistic               25          50          100

$N(0,25)$       $LM^*_H$               .0450       .0429       .0510
                $LM^*_I$               .0457       .0361       .0380
                $\widehat{LM}_H$       .0734       .0504       .0288
                $\widehat{LM}_I$       .0440       .0398       .0420

$t_5$           $LM^*_H$               .0754       .1385       .1167
                $LM^*_I$               .0707       .0351       .0660
                $\widehat{LM}_H$       .0444       .0436       .0674
                $\widehat{LM}_I$       .0454       .0293       .0706

$log$           $LM^*_H$               .1787      [.3005]     [.4767]
                $LM^*_I$               .0676       .0680       .0522
                $\widehat{LM}_H$       .0440       .0394       .0371
                $\widehat{LM}_I$       .0511       .0568       .0714

$B(7,7)$        $LM^*_H$               .0512       .0504       .0620
                $LM^*_I$               .0390       .0452       .0365
                $\widehat{LM}_H$       .0399       .0653       .0472
                $\widehat{LM}_I$       .0333       .0607       .0336

$NM$            $LM^*_H$               .2372      [.2837]     [.3546]
                $LM^*_I$               .0453       .0242       .0470
                $\widehat{LM}_H$       .0386       .0514       .0424
                $\widehat{LM}_I$       .0333       .0509       .0276

$B(3,11)$       $LM^*_H$               .0393       .0817       .0379
                $LM^*_I$               .0721       .0539       .0457
                $\widehat{LM}_H$       .0487       .0987       .0301
                $\widehat{LM}_I$       .0947       .0685       .0496

$CN$            $LM^*_H$               .0464       .1104      [.1685]
                $LM^*_I$               .0396       .0444       .0849
                $\widehat{LM}_H$       .0447       .0416       .0387
                $\widehat{LM}_I$       .0539       .0450       .0906

Note: boxed entries (significant at the 1% level) are shown in square brackets.


The 5% critical values for the Kolmogorov-Smirnov statistic for the sample sizes of 25, 50, and 100
are .2640, .1884 and .1340 respectively while the 1% critical values for 25, 50 and 100 observations
are .3166, .2260, and .1608 respectively [Pearson and Hartley (1966)]. In Table 1, the Kolmogorov-Smirnov
statistics that are significant at the 1% level are boxed. From Table 1, it is clear that
no significant departure from the asymptotic $\chi^2$ distribution can be concluded at either 5% or 1%
level of significance for all LM statistics under $N(0, 25)$, $B(7, 7)$, and $B(3, 11)$. The departure from
the $\chi^2$ distribution becomes more noticeable for $LM^*_H$ as the sample size gets bigger when the
disturbance term follows the $log$, $NM$ or $CN$ distributions. This is illustrated in Figure 2 for $log$
and Figure 3 for the $NM$ disturbance terms; both sample sizes equal 100. Both figures are plots of
the nonparametric adaptive kernel density estimates of $LM^*_H$ and $\widehat{LM}_H$ [see Silverman (1986) for
details of adaptive kernel density estimation]. We can see that $LM^*_H$ has thinner tail under $NM$
and thicker tail under $log$ than the asymptotic $\chi^2$ distribution. This suggests that under the null
hypothesis of homoskedasticity and serial independence, the distribution of the conventional LM
statistic for testing heteroskedasticity deviates away from the $\chi^2$ distribution as the distribution of
the disturbance term departs further from the normal distribution in shape while our nonparametric
heteroskedasticity test statistics are more robust to these distributional deviations. From Figures 2
and 3, it is clear that at the tails, the distributions of $\widehat{LM}_H$ and the $\chi^2$ are very close. To maintain
the correct size of a test statistic, only the tail of its distribution matters. As we will see later in
Table 2, the true Type-I error probabilities of $\widehat{LM}_H$ are very close to the nominal level of 10%.
Both the $LM^*_I$ and $\widehat{LM}_I$ statistics seem to be much less sensitive to distributional deviations in the
disturbance term.

The estimated probabilities of Type-I error for the LM statistics are reported in Table 2. The
estimated probabilities are the proportions of the replications for which the estimated LM statistics
exceed the asymptotic 10% critical values of the $\chi^2$ distributions. Since the number of
replications is 250, the standard errors of the estimated probabilities of Type-I error are no bigger than
$\sqrt{0.5(1-0.5)/250} \approx 0.032$.
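
The bound on the standard error follows from the binomial variance $p(1-p)/R$ of a rejection frequency
based on $R$ replications, which is largest at $p = 0.5$. A minimal Python sketch of this calculation,
and of the rejection frequency itself, is given below. The function names are hypothetical, and the use
of one degree of freedom for the $\chi^2$ critical value is an assumption made purely for illustration.

```python
import math
from statistics import NormalDist


def max_standard_error(replications: int) -> float:
    """Upper bound on the standard error of an estimated rejection probability.

    The binomial variance p(1 - p)/R is maximised at p = 0.5.
    """
    return math.sqrt(0.5 * (1.0 - 0.5) / replications)


def chi2_1_critical_value(alpha: float) -> float:
    """Upper-alpha critical value of a chi-square with 1 degree of freedom.

    Assumption: one degree of freedom is used only as an example.
    If Z ~ N(0, 1) then Z**2 ~ chi-square(1), so the critical value is the
    square of the (1 - alpha/2) quantile of the standard normal.
    """
    return NormalDist().inv_cdf(1.0 - alpha / 2.0) ** 2


def rejection_frequency(null_statistics, critical_value: float) -> float:
    """Proportion of simulated statistics that exceed the critical value."""
    stats = list(null_statistics)
    return sum(s > critical_value for s in stats) / len(stats)


if __name__ == "__main__":
    print(round(max_standard_error(250), 3))      # bound quoted in the text
    print(round(chi2_1_critical_value(0.10), 3))  # asymptotic 10% critical value
```

With 250 replications, an estimated size between roughly .04 and .16 is therefore within about two
standard errors of the nominal 10% level.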

Table 2. Estimated Probabilities of Type I Errors for the LM Statistics

Disturbance                            Sample Size
Distribution  Statistic              25      50     100
-------------------------------------------------------
N(0,25)       $LM_H$               .080    .116    .112
              $LM_I$               .064    .092    .096
              $\widehat{LM}_H$     .108    .128    .112
              $\widehat{LM}_I$     .076    .092    .100

$t_5$         $LM_H$               .108    .208    .200
              $LM_I$               .108    .088    .104
              $\widehat{LM}_H$     .108    .088    .068
              $\widehat{LM}_I$     .116    .092    .104

log           $LM_H$               .248    .388    .544
              $LM_I$               .084    .080    .060
              $\widehat{LM}_H$     .112    .068    .108
              $\widehat{LM}_I$     .100    .136    .104

B(7,7)        $LM_H$               .076    .072    .068
              $LM_I$               .084    .124    .100
              $\widehat{LM}_H$     .116    .100    .100
              $\widehat{LM}_I$     .072    .124    .096

NM            $LM_H$               .016    .016    .000
              $LM_I$               .144    .116    .064
              $\widehat{LM}_H$     .120    .084    .108
              $\widehat{LM}_I$     .112    .104    .084

B(3,11)       $LM_H$               .088    .124    .104
              $LM_I$               .076    .080    .104
              $\widehat{LM}_H$     .128    .100    .112
              $\widehat{LM}_I$     .072    .076    .100

CN            $LM_H$               .144    .204    .228
              $LM_I$               .100    .064    .140
              $\widehat{LM}_H$     .088    .092    .100
              $\widehat{LM}_I$     .092    .068    .144

From Table 2, it is obvious that the Type-I error probabilities for our nonparametric test statistics,
$\widehat{LM}_H$ and $\widehat{LM}_I$, are very close to the nominal 10% level under almost all sample sizes and
distributions. On the other hand, the true sizes for $LM_H$ could be very high. For example, when the
distribution is log, for a sample of size 100, $LM_H$ rejects the true null hypothesis of homoskedasticity
54% of the time. When the distribution is $t_5$ or CN, $LM_H$ also over-rejects, though less severely.
As we have noted while discussing the implications of Figure 2, over-rejection occurs since the
distribution of $LM_H$ has a much thicker tail when the normality assumption is violated. On the other
hand, the effect of the NM distribution on $LM_H$ is quite the opposite. $LM_H$ has a thinner tail than $\chi^2$,
as noted in Figure 3, resulting in very low Type-I error probabilities. The Type-I error probabilities
for $\widehat{LM}_H$ are, in contrast, very close to the nominal significance level of 10%.

As we observed in Table 1, $LM_I$ is not as sensitive to departures from normality as $LM_H$ is,
and hence the deviations from the 10% Type-I error probability of $LM_I$ are not as severe as those of
$LM_H$. These findings are consistent with those of Bera and Jarque (1982) and Bera and McKenzie
(1986), in which the $LM_H$ and $LM_I$ tests have incorrect Type-I error probabilities under log and
$t_5$ when the asymptotic critical values of the $\chi^2$ distribution are used.

Given the above results that the estimated probabilities of Type-I error for the various LM
statistics are different, it is only appropriate to compare the estimated powers of the LM statistics
using the simulated critical values. The $100\alpha\%$ simulated critical values are the $(1-\alpha)$ sample
quantiles of the estimated LM statistics simulated under the null hypothesis. The estimated powers of
the LM statistics are, hence, the number of times the statistics exceed the $(1-\alpha)$ sample quantiles
divided by the total number of replications. The $\alpha$ used in our replications is 10%. The standard
errors of the estimated powers are again at most 0.032. The estimated powers for $N = 50$ and 100 are
presented in Tables 3 and 4, respectively.
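
The size-adjusted power calculation described above can be sketched as follows. This is a minimal
Python illustration, not the code used in the study; the function names and the particular
empirical-quantile convention (the order statistic of rank $\lceil (1-\alpha)R \rceil$) are assumptions.

```python
import math


def simulated_critical_value(null_stats, alpha: float) -> float:
    """(1 - alpha) sample quantile of statistics simulated under the null.

    Assumption: the quantile is the order statistic of rank
    ceil((1 - alpha) * R); other empirical-quantile conventions exist.
    """
    ordered = sorted(null_stats)
    # round() guards against floating-point noise such as 225.00000000000003
    rank = math.ceil(round((1.0 - alpha) * len(ordered), 10))
    rank = min(max(rank, 1), len(ordered))
    return ordered[rank - 1]


def estimated_power(alt_stats, critical_value: float) -> float:
    """Share of statistics simulated under the alternative that exceed
    the simulated critical value."""
    stats = list(alt_stats)
    return sum(s > critical_value for s in stats) / len(stats)
```

By construction, applying `estimated_power` to the null replications themselves returns approximately
$\alpha$, so every statistic is compared at the same empirical size regardless of how far its null
distribution is from the $\chi^2$.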

Table 3. Estimated Powers for the LM Statistics
Number of Observations = 50

Disturbance                       Alternatives: $H_1$
Distribution  Statistic           HI(η₁)  HI(η₂)  HI(ρ₁)  HI(ρ₂)  HI(η₁,ρ₁)  HI(η₁,ρ₂)  HI(η₂,ρ₁)  HI(η₂,ρ₂)
------------------------------------------------------------------------------------------------------------
N(0,25)       $LM_H$               .592    .832    .104    .084    .548       .292       .740       .372
              $LM_I$               .112    .112    .552    .996    .576       .996       .564       .992
              $\widehat{LM}_H$     .524    .760    .100    .088    .456       .272       .688       .376
              $\widehat{LM}_I$     .116    .108    .568    .976    .568       .968       .552       .968

$t_5$         $LM_H$               .448    .608    .096    .052    .388       .132       .540       .224
              $LM_I$               .112    .120    .504   1.000    .524      1.000       .536      1.000
              $\widehat{LM}_H$     .432    .608    .104    .096    .396       .236       .600       .348
              $\widehat{LM}_I$     .132    .140    .504    .984    .532       .972       .544       .960

log           $LM_H$               .220    .292    .084    .032    .164       .072       .268       .084
              $LM_I$               .108    .116    .584   1.000    .572      1.000       .576      1.000
              $\widehat{LM}_H$     .600    .752    .124    .132    .472       .220       .656       .272
              $\widehat{LM}_I$     .092    .076    .748    .940    .716       .956       .656       .960

B(7,7)        $LM_H$               .660    .896    .100    .060    .624       .252       .828       .448
              $LM_I$               .108    .092    .528   1.000    .552      1.000       .564      1.000
              $\widehat{LM}_H$     .648    .852    .120    .088    .640       .276       .788       .424
              $\widehat{LM}_I$     .108    .092    .500    .996    .524       .992       .564       .988

NM            $LM_H$               .960    .996    .148    .284    .916       .500       .992       .720
              $LM_I$               .100    .096    .540    .984    .536       .992       .548       .992
              $\widehat{LM}_H$     .896    .956    .176    .156    .824       .352       .928       .548
              $\widehat{LM}_I$     .104    .088    .844    .980    .744       .992       .564       .988

B(3,11)       $LM_H$               .588    .844    .108    .092    .556       .264       .772       .404
              $LM_I$               .092    .116    .572    .992    .608       .996       .612      1.000
              $\widehat{LM}_H$     .604    .848    .108    .124    .588       .324       .784       .496
              $\widehat{LM}_I$     .116    .120    .560    .956    .572       .988       .616       .988

CN            $LM_H$               .396    .692    .088    .064    .400       .180       .600       .276
              $LM_I$               .104    .112    .524    .992    .560       .988       .548       .992
              $\widehat{LM}_H$     .488    .708    .104    .104    .448       .264       .636       .388
              $\widehat{LM}_I$     .112    .132    .524    .968    .544       .968       .564       .964

Table 4. Estimated Powers for the LM Statistics
Number of Observations = 100

Disturbance                       Alternatives: $H_1$
Distribution  Statistic           HI(η₁)  HI(η₂)  HI(ρ₁)  HI(ρ₂)  HI(η₁,ρ₁)  HI(η₁,ρ₂)  HI(η₂,ρ₁)  HI(η₂,ρ₂)
------------------------------------------------------------------------------------------------------------
N(0,25)       $LM_H$               .840    .988    .092    .060    .804       .412       .968       .664
              $LM_I$               .124    .132    .864   1.000    .848      1.000       .848      1.000
              $\widehat{LM}_H$     .808    .976    .100    .072    .784       .408       .952       .640
              $\widehat{LM}_I$     .120    .132    .852   1.000    .848      1.000       .848      1.000

$t_5$         $LM_H$               .688    .916    .080    .040    .636       .300       .876       .492
              $LM_I$               .080    .068    .828   1.000    .860      1.000       .876      1.000
              $\widehat{LM}_H$     .764    .952    .144    .116    .700       .448       .884       .648
              $\widehat{LM}_I$     .084    .084    .828   1.000    .876       .992       .896       .996

log           $LM_H$               .256    .364    .088    .008    .184       .052       .300       .080
              $LM_I$               .108    .112    .912   1.000    .904      1.000       .900      1.000
              $\widehat{LM}_H$     .880    .972    .136    .124    .764       .368       .928       .540
              $\widehat{LM}_I$     .116    .080    .988   1.000    .996       .992       .996      1.000

B(7,7)        $LM_H$               .928    .996    .104    .100    .896       .532       .980       .816
              $LM_I$               .120    .120    .852   1.000    .832      1.000       .844      1.000
              $\widehat{LM}_H$     .900    .992    .096    .088    .848       .492       .964       .748
              $\widehat{LM}_I$     .116    .120    .848    .996    .828      1.000       .848      1.000

NM            $LM_H$              1.000   1.000    .212    .308   1.000       .792      1.000       .944
              $LM_I$               .104    .112    .908   1.000    .892      1.000       .896      1.000
              $\widehat{LM}_H$     .992    .996    .084    .096    .960       .512       .988       .780
              $\widehat{LM}_I$     .112    .092   1.000   1.000    .984      1.000       .972      1.000

B(3,11)       $LM_H$               .856    .984    .080    .052    .792       .444       .968       .712
              $LM_I$               .112    .108    .884   1.000    .864      1.000       .876      1.000
              $\widehat{LM}_H$     .884    .984    .064    .080    .792       .448       .960       .712
              $\widehat{LM}_I$     .116    .092    .900   1.000    .880       .996       .880      1.000

CN            $LM_H$               .596    .828    .108    .032    .548       .220       .780       .376
              $LM_I$               .068    .072    .760   1.000    .752      1.000       .752      1.000
              $\widehat{LM}_H$     .648    .896    .120    .108    .588       .340       .848       .496
              $\widehat{LM}_I$     .088    .092    .776   1.000    .784       .996       .800      1.000

First we note that the estimated powers of the parametric tests $LM_H$ and $LM_I$ are similar to those
reported in Bera and Jarque (1982), and Bera and McKenzie (1986). Regarding the powers of our
nonparametric tests $\widehat{LM}_H$ and $\widehat{LM}_I$, we observe that they are comparable to their parametric
counterparts for $N(0,25)$, $B(7,7)$, $B(3,11)$ and NM disturbances. In particular, when the disturbance
distribution is normal, for which $LM_H$ and $LM_I$ are designed to perform best, we observe very
little loss of power in using $\widehat{LM}_H$ and $\widehat{LM}_I$. On the other hand, $\widehat{LM}_H$ substantially outperforms
its parametric counterpart when the disturbance term follows a lognormal distribution. To see the
difference between the performances of $LM_H$ and $\widehat{LM}_H$, we consider the case of the lognormal
distribution with sample size 50. $LM_H$ has "optimal" power of .832 for the alternative $HI(\eta_2)$ with
normal disturbance. However, the estimated power for $LM_H$ reduces to .292 when the disturbance
distribution is lognormal. When we further contaminate the data by strong autocorrelation, that
is under $HI(\eta_2,\rho_2)$, the estimated power is merely .084, even less than the size of the test. The
estimated powers for $\widehat{LM}_H$ for the above three situations are respectively .760, .752 and .272. The
power does reduce with gradual contamination, but not as drastically as that of $LM_H$. For the $t_5$ and
CN disturbances, the advantage of the nonparametric $\widehat{LM}_H$ becomes more evident as the sample
size gets bigger, under which the nonparametric efficiency begins to show up. Note that all the
distributions $t_5$, log, and CN, under which $\widehat{LM}_H$ outperforms $LM_H$, have thicker tails than the
normal distribution. The $B(7,7)$ and $B(3,11)$ distributions, under which $LM_H$ is comparable to
$\widehat{LM}_H$, have thinner tails than the normal distribution. The NM distribution, which has the same
tail behavior as the normal distribution, does not deteriorate the power of $LM_H$ substantially even
though the distribution of $LM_H$ deviates quite remarkably from the $\chi^2$ under $H_0$, as we noticed in
Figure 3. As we noted in Figure 1, the thick-tailed distributions like $t_5$ and CN have receding scores
in the tails while thin-tailed distributions have progressive scores in the tails. It is exactly the
thick-tailed distributions that cause problems in conventional statistical methods and it is these
thick-tailed distributions that robust procedures are trying to deal with.
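
To make the notion of a receding score concrete, recall that the score function is the negative
derivative of the log-density. The Python sketch below evaluates the closed-form scores of a normal and
a Student-t density. These formulas are standard results rather than expressions quoted from the text,
the function names are hypothetical, and reading $N(0,25)$ as a normal distribution with variance 25
(standard deviation 5) is an assumption.

```python
def score_normal(x: float, sigma: float = 5.0) -> float:
    """Score -d/dx log f(x) for N(0, sigma^2); it grows linearly in x.

    Assumption: sigma = 5 corresponds to reading N(0,25) as variance 25.
    """
    return x / sigma ** 2


def score_student_t(x: float, df: float = 5.0) -> float:
    """Score -d/dx log f(x) for a Student-t density with df degrees of freedom.

    The score peaks at x = sqrt(df) and then recedes toward zero, so
    extreme observations receive progressively less weight.
    """
    return (df + 1.0) * x / (df + x * x)


if __name__ == "__main__":
    for x in (1.0, 2.0, 5.0, 20.0):
        print(x, score_normal(x), round(score_student_t(x), 3))
```

The contrast in the tails is the point of the example: the normal score keeps increasing with $|x|$,
whereas the $t_5$ score turns back toward zero beyond $|x| = \sqrt{5}$.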

The parametric $LM_I$, however, seems to be less sensitive to distributional deviation of the
innovation and, hence, there are no drastic differences between $LM_I$ and $\widehat{LM}_I$ even for severe
departures from the normal distribution such as under $t_5$, log, and CN.

As was indicated above, both the $LM_H$ and $\widehat{LM}_H$ statistics for testing heteroskedasticity are
not robust to misspecifications in serial independence. The power of both tests drops when there
are severe serial correlations present in the disturbances. The effect of serial correlation is, however,
more serious for $LM_H$. For instance, when the sample size is 100 and the distribution is $t_5$, the estimated
power of $LM_H$ reduces by .424 ($= .916 - .492$) as we move from $HI(\eta_2)$ to $HI(\eta_2,\rho_2)$. On the
other hand, for $\widehat{LM}_H$ the power loss is .304 ($= .952 - .648$). This pattern is observed for almost
all distributions. The powers of $LM_I$ and $\widehat{LM}_I$ are, however, more robust to violation of the
maintained assumption of homoskedasticity. This is easily seen by looking at the powers of $LM_I$
and $\widehat{LM}_I$ under three sets of alternatives: (i) $HI(\rho_1)$ and $HI(\rho_2)$; (ii) $HI(\eta_1,\rho_1)$ and $HI(\eta_1,\rho_2)$;
and (iii) $HI(\eta_2,\rho_1)$ and $HI(\eta_2,\rho_2)$. Nevertheless, this suggests that some joint tests or a Multiple
Comparison Procedure in the same spirit as Bera and Jarque (1982) will be able to make our tests
for heteroskedasticity more robust to violation of the maintained serial independence assumption.
Furthermore, adopting a nonparametric conditional mean instead of the linear conditional mean
model [see e.g. Lee (1992)], or even using a nonparametric conditional median specification [see
e.g. Koenker and Ng (1992)], will further make our test statistics robust to misspecification of the
conditional structural model. These extensions will be reported in future work.
Our simulation results indicate that the distribution of our nonparametric LM statistic for testing
heteroskedasticity is closer to the asymptotic $\chi^2$ distribution under homoskedasticity and serial
independence for all distributions under investigation than its parametric counterpart. The
parametric LM statistic for testing autocorrelation is, nevertheless, much less sensitive to departure
from the normality assumption and hence fares as well as its nonparametric counterpart. The
estimated probabilities of Type-I error for the nonparametric LM statistics for testing both
heteroskedasticity and autocorrelation are also much closer to the nominal 10% value. The superiority
of our nonparametric LM test for heteroskedasticity becomes more prominent as the sample size
increases and as the severity of the departure from normality (measured roughly by the thickness of
the tails) increases. Therefore, we may conclude that our nonparametric test statistics are robust
to distributional misspecification and will be useful in empirical work.
Figure 1. Score Functions of Various Distributions